# Mixture of Experts
Qwen3 235B A22B GPTQ Int4
Apache-2.0
Qwen3 is the latest generation of large language models in the Qwen series, offering a range of dense and mixture-of-experts (MoE) models. Through extensive training, Qwen3 has achieved groundbreaking progress in reasoning, instruction following, agent capabilities, and multilingual support (loading sketch below).
Large Language Model
Transformers

Qwen
1,563
9
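
Since the entry above is a GPTQ Int4 checkpoint, here is a minimal loading sketch with Hugging Face Transformers. The repo id is inferred from the entry name, and the snippet assumes optimum plus a GPTQ kernel backend (e.g. GPTQModel or AutoGPTQ) are installed; it is an illustration, not the model card's recipe.

```python
# Minimal sketch: loading a GPTQ Int4 checkpoint with Transformers.
# Assumes optimum + a GPTQ backend are installed and enough GPUs are available.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-235B-A22B-GPTQ-Int4"  # repo id assumed from the entry name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",    # shard the quantized weights across available GPUs
    torch_dtype="auto",   # keep the dtype recorded in the checkpoint config
)

messages = [{"role": "user", "content": "Briefly explain mixture-of-experts routing."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Int4 weights cut memory to roughly a quarter of the BF16 checkpoint, while the MoE design still activates only about 22B of the 235B parameters per token.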
Qwen3 235B A22B FP8 Dynamic
Apache-2.0
An FP8-quantized version of the Qwen3-235B-A22B model that reduces GPU memory requirements and improves computational throughput; suitable for a wide range of natural language processing scenarios (serving sketch below).
Large Language Model
Transformers

RedHatAI
2,198
2
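
The FP8 entry above targets vLLM-style serving; the sketch below shows one way to load it, assuming vLLM picks up the quantization config stored in the repo. The repo id, GPU count, and sampling settings are illustrative assumptions.

```python
# Minimal sketch: serving an FP8-quantized MoE checkpoint with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(
    model="RedHatAI/Qwen3-235B-A22B-FP8-dynamic",  # repo id assumed from the entry
    tensor_parallel_size=8,                        # a 235B-parameter MoE still spans several GPUs
)
params = SamplingParams(temperature=0.6, max_tokens=256)
outputs = llm.generate(
    ["Summarize what FP8 dynamic quantization changes at inference time."], params
)
print(outputs[0].outputs[0].text)
```

"Dynamic" here means weights are quantized offline while activation scales are computed on the fly at inference time, which is where the memory and throughput gains in the description come from.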
Qwen3 30B A3B 128K GGUF
Apache-2.0
Qwen3 is the latest generation of large language models in the Qwen (Tongyi Qianwen) series, offering a complete suite of dense and mixture-of-experts (MoE) models. Through extensive training, Qwen3 achieves breakthrough progress in reasoning, instruction following, agent capabilities, and multilingual support (local-inference sketch for the GGUF entries below).
Large Language Model English
unsloth
48.68k
43
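
The GGUF entries in this list (this one and the two 235B variants below) are meant for llama.cpp-based runtimes; here is a minimal local-inference sketch with llama-cpp-python. The quantization file pattern, context size, and offload settings are assumptions for illustration.

```python
# Minimal sketch: running a GGUF quantization of the model above locally.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="unsloth/Qwen3-30B-A3B-128K-GGUF",  # repo id assumed from the entry name
    filename="*Q4_K_M.gguf",                    # choose one of the quantization levels in the repo
    n_ctx=8192,                                 # context actually allocated locally (the repo supports 128K)
    n_gpu_layers=-1,                            # offload all layers to GPU if one is available
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What does A3B mean in this model's name?"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```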
Qwen3 235B A22B 128K GGUF
Apache-2.0
Qwen3 is the latest generation of large language models in the Qwen (Tongyi Qianwen) series, offering a complete suite of dense and Mixture of Experts (MoE) models. Built on large-scale training, Qwen3 has achieved breakthrough progress in reasoning, instruction following, agent capabilities, and multilingual support.
Large Language Model English
unsloth
310.66k
26
Qwen3 235B A22B GGUF
Apache-2.0
Qwen3 is the latest generation of large language models in the Qwen series, offering a range of dense and mixture of experts (MoE) models. Based on extensive training, Qwen3 has achieved breakthrough progress in reasoning, instruction following, agent capabilities, and multilingual support.
Large Language Model English
unsloth
75.02k
48
Mmrexcev GRPO V0.420
A pre-trained language model merged with the SLERP method, combining the characteristics of the Captain-Eris_Violet-GRPO-v0.420 and MMR-E1 models (SLERP sketch below).
Large Language Model
Transformers

Nitral-Archive
35
2
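
The entry above names SLERP as the merge method. The sketch below shows the underlying idea, spherical interpolation between matching weight tensors, as a conceptual example only; it is not the mergekit code actually used to build the model.

```python
# Conceptual sketch of SLERP weight merging (not mergekit's implementation).
import torch

def slerp(w_a: torch.Tensor, w_b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Interpolate between two weight tensors along the arc joining their directions."""
    a, b = w_a.flatten().float(), w_b.flatten().float()
    a_dir, b_dir = a / (a.norm() + eps), b / (b.norm() + eps)
    # Angle between the two weight vectors.
    omega = torch.acos(torch.clamp(torch.dot(a_dir, b_dir), -1.0, 1.0))
    if omega.abs() < eps:                      # nearly parallel: plain linear interpolation
        return (1 - t) * w_a + t * w_b
    so = torch.sin(omega)
    merged = (torch.sin((1 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b
    return merged.reshape(w_a.shape).to(w_a.dtype)

# Usage idea: apply per matching parameter of two checkpoints, e.g. at t = 0.5:
# merged_state = {k: slerp(v, state_b[k], 0.5) for k, v in state_a.items()}
```

Compared with plain averaging, interpolating along the arc better preserves the norm and direction of the weights, which is why SLERP is a common default for two-model merges.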
Llama4some SOVL 4x8B L3 V1
A Mixture of Experts model created by merging multiple pre-trained language models with mergekit, aimed at producing highly unconstrained text generation.
Large Language Model
Transformers

saishf
22
3
Chicka Mixtral 3x7b
MIT
A Mixture of Experts large language model built from three Mistral-architecture models, excelling at dialogue, code, and mathematical tasks.
Large Language Model
Transformers

Chickaboo
77
3
Llama 3 Smaug 8B GGUF
A GGUF-format quantized model based on abacusai/Llama-3-Smaug-8B, offering 2- to 8-bit quantization levels and suitable for text generation tasks.
Large Language Model
MaziyarPanahi
8,904
5
Copus 2x8B
Copus-2x8B is a Mixture of Experts model based on the Llama-3-8B architecture, combining fine-tuned versions of dreamgen/opus-v1.2-llama-3-8b and NousResearch/Meta-Llama-3-8B-Instruct.
Large Language Model
Transformers

lodrick-the-lafted
14
1
Wizardlm 2 8x22B
Apache-2.0
WizardLM-2 8x22B is the state-of-the-art Mixture of Experts (MoE) model developed by Microsoft's WizardLM team, with significant performance improvements in complex dialogues, multilingual tasks, reasoning, and agent tasks.
Large Language Model
Transformers

dreamgen
28
31
Wizardlm 2 8x22B
Apache-2.0
WizardLM-2 8x22B is the next-generation state-of-the-art large language model developed by Microsoft AI, featuring a Mixture of Experts (MoE) architecture, excelling in complex dialogue, multilingual capabilities, reasoning, and agent tasks.
Large Language Model
Transformers

alpindale
974
400
Zephyr Orpo 141b A35b V0.1
Apache-2.0
Zephyr 141B-A35B is a large language model fine-tuned from Mixtral-8x22B-v0.1 using the ORPO alignment algorithm and designed to be a helpful assistant (ORPO training sketch below).
Large Language Model
Transformers

HuggingFaceH4
3,382
267
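
To illustrate the ORPO alignment algorithm named in the entry above, here is a hedged training sketch built on TRL's ORPOTrainer. The stand-in base model, preference dataset, and hyperparameters are assumptions for illustration and are far smaller than the actual Mixtral-8x22B recipe.

```python
# Hedged sketch: ORPO fine-tuning with TRL (illustrative, not the Zephyr recipe).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # small stand-in; Zephyr 141B starts from Mixtral-8x22B-v0.1
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# ORPO learns from preference pairs (chosen vs. rejected responses) in a single
# stage, without the separate reference model that DPO-style training needs.
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

args = ORPOConfig(
    output_dir="orpo-sketch",
    beta=0.1,                        # weight of the odds-ratio term relative to the NLL term
    per_device_train_batch_size=1,
    num_train_epochs=1,
)
trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,      # older TRL releases name this argument `tokenizer`
)
trainer.train()
```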
Phalanx 512x460M MoE
Apache-2.0
A lightweight mixture of experts model built from 512 LiteLlama-460M-1T experts, suitable for efficient inference and text generation tasks.
Large Language Model
Transformers English

Kquant03
28
2
Beyonder 4x7B V2
Other
Beyonder-4x7B-v2 is a large language model based on the Mixture of Experts (MoE) architecture, consisting of four expert modules, each specializing in a different domain: dialogue, programming, creative writing, and mathematical reasoning (routing sketch below).
Large Language Model
Transformers

mlabonne
758
130
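
Several entries in this list, including Beyonder-4x7B-v2 above, are mixtures of experts; the sketch below isolates the top-k routing idea that such models rely on. It is a conceptual illustration with arbitrary layer sizes, not code from any listed model.

```python
# Conceptual sketch of top-k expert routing in a MoE feed-forward layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 4, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)  # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Send each token to its k highest-scoring experts.
        weights, idx = torch.topk(F.softmax(self.router(x), dim=-1), self.k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)    # renormalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
print(TopKMoE(d_model=64, d_ff=128)(tokens).shape)  # torch.Size([16, 64])
```

Only k of the n experts run for any given token, which is why a "4x7B" merge stores far more parameters than it activates per token.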
Dolphin 2.7 Mixtral 8x7b AWQ
Apache-2.0
Dolphin 2.7 Mixtral 8X7B is a large language model based on the Mixtral architecture, focused on code generation and instruction following; this repository provides AWQ-quantized weights (loading sketch below).
Large Language Model
Transformers English

TheBloke
5,839
22
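
A minimal loading sketch for the AWQ entry above: Transformers can load AWQ checkpoints directly once the autoawq package is installed. The repo id is inferred from the entry name, and the prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch: running TheBloke's AWQ quantization of Dolphin 2.7 Mixtral 8x7B.
# Assumes autoawq is installed and the tokenizer ships a ChatML chat template;
# otherwise format the prompt manually.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "TheBloke/dolphin-2.7-mixtral-8x7b-AWQ"  # repo id assumed from the entry name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

generate = pipeline("text-generation", model=model, tokenizer=tokenizer)
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Write a Python one-liner that reverses a string."}],
    tokenize=False,
    add_generation_prompt=True,
)
print(generate(prompt, max_new_tokens=128, do_sample=False)[0]["generated_text"])
```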
Dolphin 2.5 Mixtral 8x7b GPTQ
Apache-2.0
Dolphin 2.5 Mixtral 8X7B is a large language model developed by Eric Hartford based on the Mixtral architecture, fine-tuned on multiple high-quality datasets, suitable for various natural language processing tasks.
Large Language Model
Transformers English

TheBloke
164
112